CPU
PICO-8 tracks CPU usage with two values: Lua cycles and system cycles. Most operations affect Lua cycles, but some functions have an additional system cycle cost. There are 4194304 cycles in a second (2^22), so about 69905 cycles per frame at 60 FPS, or 139810 cycles per frame at 30 FPS. The function call stat(1) returns the total (Lua + system) cycle ratio for the current frame, and stat(2) returns the system ratio. Example: since cls() uses 2 Lua cycles and 1024 system cycles, PICO-8 running at 60 FPS can call cls() about 2^22/60/1026 = 68 times per frame.
Optimization Tips
Some tips for when your code isn't running fast enough (these will increase your code's size and reduce its clarity, so it's a trade-off):
* First, make sure you know why your code is running slowly: which part is costing the most time? Use time() or stat(1) calls to measure this, or just delete blocks of code to see where the problem lies.
* Focus only on the code causing the most slowdown (usually a while/for loop), and only until the desired speed is achieved, as optimizing your whole code will quickly run you out of tokens for no actual gain. (Often, 99% of the time is spent in 1% of the code. Optimizing the rest of the code is pointless.)
* Printing stat(1) with printh just before the end of _update and _draw (or before the flip) is invaluable here: it shows whether your game's actual performance is improving (or not) as you make optimizations.
* If an optimization doesn't seem to help actual performance (as measured by the stat(1) output from the previous point), you've probably failed to find the actual problem point; spend more time looking for it.
Now that you've found the code causing the slowdown:
* You can always remove it if it's not essential. That's one of the only optimizations that will improve your code size and clarity, too!
* Forget about the code for a moment and consider what it's supposed to be doing: what's the fastest way that can be implemented?
Can a clever algorithm or data structure be used to avoid pointless calculation?
* For example, PICO-8 has a fair(ish) amount of Lua memory (2 MB), so a function that has a small (or sometimes not-so-small) set of possible inputs and does slow computations on them can often be replaced with a lookup table (which could be computed at startup time, if too large to fit in the code).
Now onto the micro-optimizations:
* Function calls have a cost, so inlining short calls (replacing the calls with the code inside the function) can help performance (in exchange for severely harming code size and clarity, so use with care).
* Access to global or non-local variables (locals from other functions) is slower than access to local variables, so use local variables instead, if possible. If a global or non-local variable is read multiple times, caching it in a local variable first saves cycles (this helps a bit even if the variable is read only twice).
Lua cycles
Some standard Lua operation costs (tested on 0.1.12c):
* Access to global variables or to local variables from another function: 1 cycle per access. (Only local variables within the same function avoid this cost.)
* Assignment statement: minimum 1 cycle per assignment; does not combine with other costs. (E.g. 'a=b' and 'a=b+c' both cost only 1 cycle; 'a,b=b,a' costs 2.)
* Regular arithmetic operators: 1 cycle (note that x^y is excluded, as it has an additional varying system cost).
* Logical operators: for a chain of N and/or operators, of which K are evaluated (not short-circuited), the cost is K+1 cycles ('not' is 1 cycle). (E.g. 'a and b and c' costs 1, 2, or 3, depending on how many are evaluated.)
* Relational operators: if directly inside an if/while condition: 0 cycles. If directly inside a logical operator: 1 cycle total for all relational operators within that chain of logical operators. If directly inside something else: 2 cycles. (E.g.
'a==b==c' and 'a==a and b==b and c==c' both cost 4 cycles.)
* String concatenation: 3 cycles
* Table element access: 1 cycle
* Table construction: 1+n+L cycles, where n is the number of elements, and L is 1 if some of the elements are list-style (without an explicit key) or otherwise 0. (E.g. {a,b} is 4 cycles, but {[1]=a,[2]=b} is 3 cycles. Funny.)
* Function construction: 1 cycle, or 2 cycles if it captures any local or non-local variables.
* Function call: 2+n cycles, where n is the number of arguments. This is regardless of whether the function is accessed through a local or a global.
* Function return: 1+n cycles, where n is the number of return values. This also applies to the implicit "return" at the end of all user-defined functions (which costs 1 cycle).
* If statement: roughly 1 cycle per evaluated if/elseif.
* While loop: roughly 1 cycle per iteration.
* Numeric for loop: 5+n cycles, where n is the number of iterations.
* 'do … end': 0 cycles
* Metamethod access: 0 cycles (doesn't include the cost of the metamethod itself)
Lua CPU stats are only updated every 1024 cycles.
Functions that add negative Lua cycles
Some functions have negative Lua cycles associated with them that get subtracted from the Lua cycle count by the PICO-8 runtime. This mechanism allows PICO-8 to make these functions artificially cheaper. For instance, shl(x,y) should cost 4 cycles because it is a function call with two arguments, but each call subtracts 3 cycles from the Lua cycle counter, for a total of 1 cycle. The table below lists functions that have their total cost tweaked in this way.
Functions that add Lua cycles
A few functions consume additional Lua cycles (in addition to the standard cost of 2+(#arguments)): (Measured on PICO-8 0.1.12d RC10)
The following functions neither add nor subtract cycles, and cost the standard amount: sgn(), abs(), sin(), cos(), atan2(). min(), max(), mid(). camera(), clip(), cursor(), fillp(), pal(), palt().
fget(), fset(), mget(), mset(), pget(), pset(), sget(), sset(). cocreate(), coresume(), costatus(), dget(), dset(), time(), type(). getmetatable(), setmetatable(), pairs(), next(), rawget(), rawset(). sub(), tonum().
System cycles
A few functions consume system cycles. Note that these are added on top of their standard Lua cycle cost. System CPU stats are updated after each call. Here is the list, most measured on PICO-8 0.1.11g:
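As a rough illustration of how the Lua and system costs on this page add up, the following sketch redoes the cls() arithmetic from the overview and encodes a few of the Lua cost formulas listed under Lua cycles. The function names (cycles_per_frame, call_cost, table_cost, calls_per_frame) are made up for this example and are not part of the PICO-8 API. It is meant to be run in a desktop Lua interpreter rather than inside PICO-8, since 4194304 is outside the range of PICO-8's 16.16 fixed-point numbers.

```lua
-- Cycle figures come from the text on this page; the helper names are
-- hypothetical and exist only for this example.
local CYCLES_PER_SECOND = 4194304 -- 2^22

-- Whole cycles available in one frame at the given frame rate.
function cycles_per_frame(fps)
  return math.floor(CYCLES_PER_SECOND / fps)
end

-- Standard Lua cost of a function call: 2+n, n = number of arguments.
function call_cost(n_args)
  return 2 + n_args
end

-- Lua cost of a table constructor: 1+n+L, where L is 1 if any element
-- is list-style (has no explicit key) and 0 otherwise.
function table_cost(n_elems, has_list_items)
  return 1 + n_elems + (has_list_items and 1 or 0)
end

-- How many calls with the given Lua + system cost fit into one frame.
function calls_per_frame(lua_cycles, system_cycles, fps)
  return math.floor(CYCLES_PER_SECOND / fps / (lua_cycles + system_cycles))
end

print(cycles_per_frame(60))         -- 69905
print(cycles_per_frame(30))         -- 139810
print(call_cost(2))                 -- 4, e.g. shl(x,y) before its -3 adjustment
print(table_cost(2, true))          -- 4, as for {a,b}
print(table_cost(2, false))         -- 3, as for a table with two explicit keys
print(calls_per_frame(2, 1024, 60)) -- 68, the cls() example
```

Under these assumptions a frame at 60 FPS has room for about 68 cls() calls, matching the figure in the overview. The same helper can be fed the cost of any other function once its Lua and system cycles are known.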